Audio Classification in Speech and Music: A Comparison between a Statistical and a Neural Approach

نویسندگان

Alessandro Bugatti

Alessandra Flammini

Pierangelo Migliorati

چکیده

We focus the attention on the problem of audio classification in speech and music for multimedia applications. In particular, we present a comparison between two different techniques for speech/music discrimination. The first method is based on zero crossing rate and Bayesian classification. It is very simple from a computational point of view, and gives good results in case of pure music or speech. The simulation results show that some performance degradation arises when the music segment contains also some speech superimposed on music, or strong rhythmic components. To overcome these problems, we propose a second method, that uses more features, and is based on neural networks (specifically a multi-layer Perceptron). In this case we obtain better performance, at the expense of a limited growth in the computational complexity. In practice, the proposed neural network is simple to be implemented if a suitable polynomial is used as the activation function, and a real-time implementation is possible even if low-cost embedded systems are used.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

شناسایی خودکار سبک موسیقی

Nowadays, automatic analysis of music signals has gained a considerable importance due to the growing amount of music data found on the Web. Music genre classification is one of the interesting research areas in music information retrieval systems. In this paper several techniques were implemented and evaluated for music genre classification including feature extraction, feature selection and m...

متن کامل

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

Music Training Program: A Method Based on Language Development and Principles of Neuroscience to Optimize Speech and Language Skills in Hearing-Impaired Children

Introduction: In recent years, music has been employed in many intervention and rehabilitation program to enhance cognitive abilities in patients. Numerous researches show that music therapy can help improving language skills in patients including hearing impaired. In this study, a new method of music training is introduced based on principles of neuroscience and capabilities of Persian languag...

متن کامل

Audio Features for Noisy Sound Segmentation

Automatic audio classification usually considers sounds as music, speech, silence or noise, but works about the noise class are rare. Audio features are generally specific to speech or music signals. In this paper, we present a new audio feature sets that lead to the definition of four classes: colored, pseudo-periodic, impulsive and sinusoids within noises. This classification relies on works ...

متن کامل

Functional Brain Response to Emotional Muical Stimuli in Depression, Using INLA Approach for Approximate Bayesian Inference

Introduction: One of the vital skills which has an impact on emotional health and well-being is the regulation of emotions. In recent years, the neural basis of this process has been considered widely. One of the powerful tools for eliciting and regulating emotion is music. The Anterior Cingulate Cortex (ACC) is part of the emotional neural circuitry involved in Major Depressive Disorder (MDD)....

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

EURASIP J. Adv. Sig. Proc.

دوره 2002 شماره

صفحات -

تاریخ انتشار 2002

Audio Classification in Speech and Music: A Comparison between a Statistical and a Neural Approach

نویسندگان

چکیده

منابع مشابه

شناسایی خودکار سبک موسیقی

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Music Training Program: A Method Based on Language Development and Principles of Neuroscience to Optimize Speech and Language Skills in Hearing-Impaired Children

Audio Features for Noisy Sound Segmentation

Functional Brain Response to Emotional Muical Stimuli in Depression, Using INLA Approach for Approximate Bayesian Inference

عنوان ژورنال:

اشتراک گذاری